A Comparison of Suffix Tree based Indexing and Search Techniques for Querying Protein Structures
Abstract
Biological research frequently encounters protein structures inside a cell that must be mapped to known proteins in order to determine their function quickly. Efficient techniques for searching a database of all known proteins for a given structure are needed to classify the protein and predict its function. Comparing the structure of an unknown protein individually with every protein in the database can be highly inefficient. Database indexing methods are therefore well suited to matching an unknown protein structure against the existing set of proteins. Various indexing techniques that use different features of protein structure have been proposed to date. Indexing techniques based on the suffix tree data structure are of prime importance because they provide efficient querying algorithms. In this report we compare three such techniques, each of which extracts features from protein structures, encodes them as a sequence, builds a suffix tree index, and provides an algorithm for querying the index with an unknown protein.
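The encode, index, and query pipeline described in the abstract can be illustrated with a minimal sketch. It does not reproduce any of the three compared techniques. It assumes each structure has already been encoded as a string over a small alphabet (here the hypothetical letters H, E, and C), and it uses a naive suffix trie rather than a compressed, linear-time suffix tree. The names `SuffixTrie`, `add`, and `query` are illustrative only.

```python
# Illustrative sketch only: a naive generalized suffix trie, not a
# compressed suffix tree. Input strings stand in for encoded structures;
# the H/E/C alphabet is an assumption, not taken from the report.


def _new_node():
    return {"children": {}, "ids": set()}


class SuffixTrie:
    def __init__(self):
        self.root = _new_node()

    def add(self, protein_id, encoded):
        # Insert every suffix; each node records which proteins
        # contain the substring spelled by the path from the root.
        for start in range(len(encoded)):
            node = self.root
            for symbol in encoded[start:]:
                node = node["children"].setdefault(symbol, _new_node())
                node["ids"].add(protein_id)

    def query(self, pattern):
        # Walk the pattern from the root; a missing edge means no match.
        node = self.root
        for symbol in pattern:
            node = node["children"].get(symbol)
            if node is None:
                return set()
        return set(node["ids"])


index = SuffixTrie()
index.add("p1", "HHEEC")
index.add("p2", "CEEHH")
print(index.query("EE"))
```

A query costs time proportional to the pattern length, independent of the number of indexed proteins, which is the property that makes suffix-based indexes attractive compared with scanning every database entry.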
Similar resources
Geometric Suffix Tree: A New Index Structure for Protein 3-D Structures
Protein structure analysis is one of the most important research issues in the post-genomic era, and faster and more accurate query data structures for such 3-D structures are highly desired for research on proteins. This paper proposes a new data structure for indexing protein 3-D structures. For strings, there are many efficient indexing structures such as suffix trees, but it has been consid...
A Divisive Hierarchical Clustering-Based Method for Indexing Image Data
It is conventional to use multi-dimensional indexing structures to accelerate search operations in content-based image retrieval systems. Many efforts have been done in order to develop multi-dimensional indexing structures so far. In most practical applications of image retrieval, high-dimensional feature vectors are required, but current multi-dimensional indexing structures lose their effici...
Structures of String Matching and Data Compression
This doctoral dissertation presents a range of results concerning efficient algorithms and data structures for string processing, including several schemes contributing to sequential data compression. It comprises both theoretic results and practical implementations. We study the suffix tree data structure, presenting an efficient representation and several generalizations. This includes augmen...
Indexing Methods for Protein Tertiary and Predicted Structures
This thesis focuses on the problem of fast sub-structure search and remote homology detection in proteins by finding similar (sub)structures. That is, for a given query protein and a large database of protein structures, we want to retrieve all the similar structures from the database rapidly. With the growing number of proteins deposited in the database, searching the database is a difficult a...
Persistency in Suffix Trees with Applications to String Interval Problems
The suffix tree has proven to be an invaluable indexing data structure, which is widely used as a building block in many applications. We study the problem of making a suffix tree persistent. Specifically, consider a streamed text T where characters are appended to the beginning of the text. The suffix tree is updated for each character appended. We wish to allow access to any previous version ...
Publication date: 2010